import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

Fine-tuning Florence-2 on Object Detection Dataset


Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license. The model demonstrates strong zero-shot and fine-tuning capabilities across tasks such as captioning, object detection, grounding, and segmentation.

Figure 1. Illustration showing the level of spatial hierarchy and semantic granularity expressed by each task. Source: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.

The model takes images and task prompts as input, generating the desired results in text format. It uses a DaViT vision encoder to convert images into visual token embeddings. These are then concatenated with BERT-generated text embeddings and processed by a transformer-based multi-modal encoder-decoder to generate the response.

Figure 2. Overview of Florence-2 architecture. Source: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.
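
The concatenation step described above can be pictured with a toy example. This is only a sketch in plain Python: the token counts and the embedding width of 4 are made up, `build_encoder_input` is a hypothetical name, and placing the visual tokens before the prompt tokens is an assumption of the sketch rather than something the text above states.

```python
from typing import List

Embedding = List[float]


def build_encoder_input(
    visual_tokens: List[Embedding], text_tokens: List[Embedding]
) -> List[Embedding]:
    """Join two token sequences along the sequence axis.

    Both sequences must use the same embedding width, otherwise the
    encoder-decoder could not process them as one sequence.
    """
    widths = {len(token) for token in visual_tokens + text_tokens}
    if len(widths) != 1:
        raise ValueError("visual and text embeddings must share one width")
    return visual_tokens + text_tokens


# Toy sizes: 6 visual tokens and 2 prompt tokens, each of width 4.
visual = [[0.0] * 4 for _ in range(6)]
prompt = [[1.0] * 4 for _ in range(2)]
sequence = build_encoder_input(visual, prompt)
print(len(sequence), len(sequence[0]))
```

The point of the sketch is only the shape bookkeeping: the sequence length grows by the number of prompt tokens while the embedding width stays fixed.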

Setup

Configure your API keys

To fine-tune Florence-2, you need to provide your Hugging Face token and Roboflow API key. Follow these steps:

  • Open your HuggingFace Settings page. Click Access Tokens, then New Token to generate a new token.
  • Go to your Roboflow Settings page. Click Copy. This will place your private key in the clipboard.
  • In Colab, go to the left pane and click on Secrets (🔑).
    • Store HuggingFace Access Token under the name HF_TOKEN.
    • Store Roboflow API Key under the name ROBOFLOW_API_KEY.
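
Outside Colab, the same two names can be exported as environment variables instead; the dataset cell later in this notebook reads the Roboflow key with os.getenv("ROBOFLOW_API_KEY"). A minimal sketch of a stricter lookup follows. `read_secret` is a hypothetical helper introduced here for illustration, not part of any library.

```python
import os


def read_secret(name: str) -> str:
    """Return the value of environment variable `name`, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it in your shell or add it to Colab Secrets."
        )
    return value


# Example usage (commented out so the sketch runs without any keys configured):
# ROBOFLOW_API_KEY = read_secret("ROBOFLOW_API_KEY")
# HF_TOKEN = read_secret("HF_TOKEN")
```

Failing early here gives a readable error instead of an authentication failure several cells later.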

Select the runtime

Let’s make sure that we have access to a GPU. We can use the nvidia-smi command to check. If no GPU is listed, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to L4 GPU, and then click Save.

!nvidia-smi
Sun Feb 16 17:48:15 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   35C    P0             64W /  500W |       5MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00000000:41:00.0 Off |                    0 |
| N/A   40C    P0             78W /  500W |   12139MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-80GB          On  |   00000000:81:00.0 Off |                    0 |
| N/A   49C    P0            336W /  500W |   19831MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-80GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   34C    P0             62W /  500W |       5MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   2616743      C   ...onda3/envs/shataxi_space/bin/python      12128MiB |
|    2   N/A  N/A   2627487      C   ...naconda3/envs/zeel_py310/bin/python      19820MiB |
+-----------------------------------------------------------------------------------------+
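
The same check can be scripted. The sketch below uses only the standard library and assumes that `nvidia-smi -L` (one line per GPU) is available on the machine; `list_gpus` is a hypothetical helper name. Note that nvidia-smi reports every GPU in the machine, while the CUDA_VISIBLE_DEVICES setting at the top of this notebook restricts the Python process to GPU 0.

```python
import shutil
import subprocess
from typing import List


def list_gpus() -> List[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none can be listed."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed, so no NVIDIA GPU is usable
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print(f"{len(gpus)} GPU(s) reported by nvidia-smi")
```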

Download example data

NOTE: Feel free to replace our example image with your own photo.

!wget -q https://media.roboflow.com/notebooks/examples/dog.jpeg
!ls -lh
total 12M
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.1
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.2
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.3
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.4
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.5
-rw-rw-r-- 1 patel_zeel patel_zeel 104K Jun  2  2023  dog.jpeg.6
-rw-rw-r-- 1 patel_zeel patel_zeel 3.0M Feb 16 17:42 'how-to-finetune-florence-2-on-detection-dataset copy 2.ipynb'
-rw-rw-r-- 1 patel_zeel patel_zeel 3.0M Feb 16 17:42 'how-to-finetune-florence-2-on-detection-dataset copy.ipynb'
-rw-rw-r-- 1 patel_zeel patel_zeel 2.6M Feb 16 17:48  how-to-finetune-florence-2-on-detection-dataset.ipynb
drwxrwxr-x 4 patel_zeel patel_zeel 4.0K Feb 16 17:34  model_checkpoints
drwxrwxr-x 5 patel_zeel patel_zeel 4.0K Feb 16 17:29  poker-cards-4
-rw-rw-r-- 1 patel_zeel patel_zeel 2.5M Jan 22 10:51  scratchpad.ipynb
EXAMPLE_IMAGE_PATH = "dog.jpeg"

Download and configure the model

Let’s download the model checkpoint and configure it so that you can fine-tune it later on.

# !pip install -q transformers flash_attn timm einops peft
# !pip install -q roboflow git+https://github.com/roboflow/supervision.git
# @title Imports

import io
import os
import re
import json
import torch
import html
import base64
import itertools

import numpy as np
import supervision as sv

# from google.colab import userdata
from IPython.display import display, HTML
from torch.utils.data import Dataset, DataLoader
from transformers import (
    AdamW,
    AutoModelForCausalLM,
    AutoProcessor,
    get_scheduler
)
from tqdm import tqdm
from typing import List, Dict, Any, Tuple, Generator
from peft import LoraConfig, get_peft_model
from PIL import Image
from roboflow import Roboflow

Load the model using AutoModelForCausalLM and the processor using AutoProcessor classes from the transformers library. Note that you need to pass trust_remote_code as True since this model is not a standard transformers model.

CHECKPOINT = "microsoft/Florence-2-base-ft"
# REVISION = 'refs/pr/6'
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = AutoModelForCausalLM.from_pretrained(CHECKPOINT, trust_remote_code=True).to(DEVICE)
processor = AutoProcessor.from_pretrained(CHECKPOINT, trust_remote_code=True)
Importing from timm.models.layers is deprecated, please import via timm.layers
Florence2LanguageForConditionalGeneration has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.

Run inference with pre-trained Florence-2 model

# @title Example object detection inference

image = Image.open(EXAMPLE_IMAGE_PATH)
task = "<OD>"
text = "<OD>"

inputs = processor(text=text, images=image, return_tensors="pt").to(DEVICE)
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
response = processor.post_process_generation(generated_text, task=task, image_size=(image.width, image.height))
detections = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, response, resolution_wh=image.size)

bounding_box_annotator = sv.BoxAnnotator(color_lookup=sv.ColorLookup.INDEX)
label_annotator = sv.LabelAnnotator(color_lookup=sv.ColorLookup.INDEX)

image = bounding_box_annotator.annotate(image, detections)
image = label_annotator.annotate(image, detections)
image.thumbnail((600, 600))
image

# @title Example image captioning inference

image = Image.open(EXAMPLE_IMAGE_PATH)
task = "<DETAILED_CAPTION>"
text = "<DETAILED_CAPTION>"

inputs = processor(text=text, images=image, return_tensors="pt").to(DEVICE)
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
response = processor.post_process_generation(generated_text, task=task, image_size=(image.width, image.height))
response
{'<DETAILED_CAPTION>': 'In this image we can see a person wearing a bag and holding a dog. In the background there are buildings, poles and sky with clouds.'}
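
The inference cells in this section repeat the same steps: build inputs, generate, decode, post-process. A sketch of a helper that factors them out is shown below. `run_florence2_task` is a hypothetical name, and it expects the `model` and `processor` objects loaded earlier to be passed in; the calls inside it are the ones already used in the cells above.

```python
from typing import Any, Dict, Optional


def run_florence2_task(
    model: Any,
    processor: Any,
    image: Any,
    task: str,
    text: Optional[str] = None,
    device: Any = "cpu",
    max_new_tokens: int = 1024,
    num_beams: int = 3,
) -> Dict[str, Any]:
    """Run one Florence-2 task prompt on one image and return the parsed response."""
    prompt = text if text is not None else task  # plain tasks use the task token as the prompt
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        generated_text, task=task, image_size=(image.width, image.height)
    )


# Example usage with the objects defined in this notebook (commented out here):
# response = run_florence2_task(model, processor, image, "<DETAILED_CAPTION>", device=DEVICE)
```

For grounding, the prompt differs from the task token, so both would be passed, for example task="<CAPTION_TO_PHRASE_GROUNDING>" and text="<CAPTION_TO_PHRASE_GROUNDING> Vehicle".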
# @title Example caption to phrase grounding inference

image = Image.open(EXAMPLE_IMAGE_PATH)
task = "<CAPTION_TO_PHRASE_GROUNDING>"
text = "<CAPTION_TO_PHRASE_GROUNDING> Vehicle"

inputs = processor(text=text, images=image, return_tensors="pt").to(DEVICE)
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
response = processor.post_process_generation(generated_text, task=task, image_size=(image.width, image.height))
detections = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, response, resolution_wh=image.size)

bounding_box_annotator = sv.BoxAnnotator(color_lookup=sv.ColorLookup.INDEX)
label_annotator = sv.LabelAnnotator(color_lookup=sv.ColorLookup.INDEX)

image = bounding_box_annotator.annotate(image, detections)
image = label_annotator.annotate(image, detections)
image.thumbnail((600, 600))
image

Fine-tune Florence-2 on custom dataset

Download dataset from Roboflow Universe

ROBOFLOW_API_KEY = os.getenv("ROBOFLOW_API_KEY")
rf = Roboflow(api_key=ROBOFLOW_API_KEY)

project = rf.workspace("roboflow-jvuqo").project("poker-cards-fmjio")
version = project.version(4)
dataset = version.download("florence2-od")
loading Roboflow workspace...
loading Roboflow project...
!head -n 5 {dataset.location}/train/annotations.jsonl
{"image":"IMG_20220316_172418_jpg.rf.e3cb4a86dc0247e71e3697aa3e9db923.jpg","prefix":"<OD>","suffix":"9 of clubs<loc_138><loc_100><loc_470><loc_448>10 of clubs<loc_388><loc_145><loc_670><loc_453>jack  of clubs<loc_566><loc_166><loc_823><loc_432>queen of clubs<loc_365><loc_465><loc_765><loc_999>king of clubs<loc_601><loc_440><loc_949><loc_873>"}
{"image":"IMG_20220316_171515_jpg.rf.e3b1932bb375b3b3912027647586daa8.jpg","prefix":"<OD>","suffix":"5 of clubs<loc_554><loc_2><loc_763><loc_467>6 of clubs<loc_399><loc_79><loc_555><loc_466>7 of clubs<loc_363><loc_484><loc_552><loc_905>8 of clubs<loc_535><loc_449><loc_757><loc_971>"}
{"image":"IMG_20220316_165139_jpg.rf.e30257ec169a2bfdfecb693211d37250.jpg","prefix":"<OD>","suffix":"9 of diamonds<loc_596><loc_535><loc_859><loc_982>jack of diamonds<loc_211><loc_546><loc_411><loc_880>queen of diamonds<loc_430><loc_34><loc_692><loc_518>king of diamonds<loc_223><loc_96><loc_451><loc_523>10 of diamonds<loc_387><loc_542><loc_604><loc_925>"}
{"image":"IMG_20220316_143407_jpg.rf.e1eb3be3efc6c3bbede436cfb5489e7c.jpg","prefix":"<OD>","suffix":"ace of hearts<loc_345><loc_315><loc_582><loc_721>2 of hearts<loc_709><loc_115><loc_888><loc_509>3 of hearts<loc_529><loc_228><loc_735><loc_613>4 of hearts<loc_98><loc_421><loc_415><loc_845>"}
{"image":"IMG_20220316_165139_jpg.rf.e4c229a9128494d17992cbe88af575df.jpg","prefix":"<OD>","suffix":"9 of diamonds<loc_141><loc_18><loc_404><loc_465>jack of diamonds<loc_589><loc_120><loc_789><loc_454>queen of diamonds<loc_308><loc_482><loc_570><loc_966>king of diamonds<loc_549><loc_477><loc_777><loc_904>10 of diamonds<loc_396><loc_75><loc_613><loc_458>"}
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
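
Each suffix in annotations.jsonl is a flat string: a label followed by four location tokens `<loc_x1><loc_y1><loc_x2><loc_y2>`, repeated once per object. The sketch below parses that string back into labels and pixel boxes. It assumes the location tokens index 1000 bins per axis and are mapped to bin centers, which matches the 0.32 coordinate that appears in the model outputs later in this notebook for 640-pixel images; `parse_od_suffix` is a hypothetical helper, not a Roboflow or transformers function.

```python
import re
from typing import List, Tuple

Box = Tuple[float, float, float, float]

# One detection: free-text label, then exactly four <loc_N> tokens.
_DETECTION = re.compile(r"([^<]+)<loc_(\d+)><loc_(\d+)><loc_(\d+)><loc_(\d+)>")


def parse_od_suffix(
    suffix: str, image_wh: Tuple[int, int], bins: int = 1000
) -> List[Tuple[str, Box]]:
    """Convert a Florence-2 style OD suffix into (label, (x1, y1, x2, y2)) pairs in pixels."""
    width, height = image_wh
    detections = []
    for label, x1, y1, x2, y2 in _DETECTION.findall(suffix):
        box = (
            (int(x1) + 0.5) * width / bins,   # assumed bin-center mapping
            (int(y1) + 0.5) * height / bins,
            (int(x2) + 0.5) * width / bins,
            (int(y2) + 0.5) * height / bins,
        )
        detections.append((label.strip(), box))
    return detections


sample = "9 of clubs<loc_138><loc_100><loc_470><loc_448>10 of clubs<loc_388><loc_145><loc_670><loc_453>"
for label, box in parse_od_suffix(sample, image_wh=(640, 640)):
    print(label, [round(v, 2) for v in box])
```

A parser like this is handy for sanity-checking a dataset before training, for example to count how often each label occurs.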
# # read jsonl file
# def read_jsonl(file_path: str) -> Generator[Dict[str, Any], None, None]:
#     with open(file_path, "r") as f:
#         for line in f:
#             yield json.loads(line)

# lines = []
# split = "test"     
# for line in read_jsonl(dataset.location + f"/{split}/annotations.jsonl"):
#     # print(line)
#     # edit = True
#     # copied_line = list(line['suffix'])
#     # for i in range(len(copied_line)):
#     #     if copied_line[i] == "<":
#     #         edit = False
#     #     elif copied_line[i] == ">":
#     #         edit = True
#     #     else:
#     #         if edit:
#     #             copied_line[i] = chr(ord(copied_line[i]) + 1)
#     # copied_line = "".join(copied_line)
#     # line['suffix'] = copied_line
    
#     line['suffix'] = line['suffix'].replace("club", "dog").replace("diamond", "cat").replace("heart", "bird").replace("spade", "fish")
#     print(line)
#     lines.append(line)

# with open(dataset.location + f"/{split}/annotations.jsonl", "w") as f:
#     for line in lines:
#         f.write(json.dumps(line) + "\n")
# @title Define `DetectionDataset` class

class JSONLDataset:
    def __init__(self, jsonl_file_path: str, image_directory_path: str):
        self.jsonl_file_path = jsonl_file_path
        self.image_directory_path = image_directory_path
        self.entries = self._load_entries()

    def _load_entries(self) -> List[Dict[str, Any]]:
        entries = []
        with open(self.jsonl_file_path, 'r') as file:
            for line in file:
                data = json.loads(line)
                entries.append(data)
        return entries

    def __len__(self) -> int:
        return len(self.entries)

    def __getitem__(self, idx: int) -> Tuple[Image.Image, Dict[str, Any]]:
        if idx < 0 or idx >= len(self.entries):
            raise IndexError("Index out of range")

        entry = self.entries[idx]
        image_path = os.path.join(self.image_directory_path, entry['image'])
        try:
            image = Image.open(image_path)
            return (image, entry)
        except FileNotFoundError:
            raise FileNotFoundError(f"Image file {image_path} not found.")


class DetectionDataset(Dataset):
    def __init__(self, jsonl_file_path: str, image_directory_path: str):
        self.dataset = JSONLDataset(jsonl_file_path, image_directory_path)

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        image, data = self.dataset[idx]
        prefix = data['prefix']
        suffix = data['suffix']
        return prefix, suffix, image
# @title Initiate `DetectionDataset` and `DataLoader` for train and validation subsets

BATCH_SIZE = 6
NUM_WORKERS = 0

def collate_fn(batch):
    questions, answers, images = zip(*batch)
    inputs = processor(text=list(questions), images=list(images), return_tensors="pt", padding=True).to(DEVICE)
    return inputs, answers

train_dataset = DetectionDataset(
    jsonl_file_path = f"{dataset.location}/train/annotations.jsonl",
    image_directory_path = f"{dataset.location}/train/"
)
val_dataset = DetectionDataset(
    jsonl_file_path = f"{dataset.location}/valid/annotations.jsonl",
    image_directory_path = f"{dataset.location}/valid/"
)

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, collate_fn=collate_fn, num_workers=NUM_WORKERS, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, collate_fn=collate_fn, num_workers=NUM_WORKERS)
# @title Setup LoRA Florence-2 model

# config = LoraConfig(
#     r=8,
#     lora_alpha=8,
#     target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
#     task_type="CAUSAL_LM",
#     lora_dropout=0.05,
#     bias="none",
#     inference_mode=False,
#     use_rslora=True,
#     init_lora_weights="gaussian",
# )
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    init_lora_weights="gaussian",
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
trainable params: 1,929,928 || all params: 272,733,896 || trainable%: 0.7076
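
The numbers printed above can be checked by hand. For a linear layer with `in_features` inputs and `out_features` outputs, LoRA adds two low-rank matrices with r * in_features + out_features * r parameters in total, and the update is scaled by lora_alpha / r (16 / 8 = 2 with the config above). The sketch below reproduces the reported percentage; the 768-wide layer is a hypothetical example size, not a value taken from the model.

```python
def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (r x in), B is (out x r)."""
    return r * in_features + out_features * r


def trainable_percent(trainable: int, total: int) -> float:
    """Share of trainable parameters, in percent."""
    return 100.0 * trainable / total


# Hypothetical 768 -> 768 projection with the rank used in the config above.
print(lora_param_count(768, 768, r=8))

# Figures copied from the print_trainable_parameters() output above.
print(round(trainable_percent(1_929_928, 272_733_896), 4))

# Scaling factor applied to the low-rank update with lora_alpha=16, r=8.
print(16 / 8)
```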
torch.cuda.empty_cache()
# @title Run inference with pre-trained Florence-2 model on validation dataset

def render_inline(image: Image.Image, resize=(128, 128)):
    """Convert image into inline html."""
    image = image.resize(resize)  # resize() returns a new image; it does not modify in place
    with io.BytesIO() as buffer:
        image.save(buffer, format='jpeg')
        image_b64 = str(base64.b64encode(buffer.getvalue()), "utf-8")
        return f"data:image/jpeg;base64,{image_b64}"


def render_example(image: Image.Image, response):
    try:
        detections = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, response, resolution_wh=image.size)
        image = sv.BoxAnnotator(color_lookup=sv.ColorLookup.INDEX).annotate(image.copy(), detections)
        image = sv.LabelAnnotator(color_lookup=sv.ColorLookup.INDEX).annotate(image, detections)
    except Exception:
        print('failed to render model response')
    return f"""
<div style="display: inline-flex; align-items: center; justify-content: center;">
    <img style="width:256px; height:256px;" src="{render_inline(image, resize=(128, 128))}" />
    <p style="width:512px; margin:10px; font-size:small;">{html.escape(json.dumps(response))}</p>
</div>
"""


def render_inference_results(model, dataset: DetectionDataset, count: int):
    html_out = ""
    count = min(count, len(dataset))
    for i in range(count):
        image, data = dataset.dataset[i]
        prefix = data['prefix']
        suffix = data['suffix']
        inputs = processor(text=prefix, images=image, return_tensors="pt").to(DEVICE)
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3
        )
        generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
        answer = processor.post_process_generation(generated_text, task='<OD>', image_size=image.size)
        html_out += render_example(image, answer)

    display(HTML(html_out))

render_inference_results(peft_model, val_dataset, 4)

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["bed"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["table"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["chair"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["furniture"]}}

Fine-tune Florence-2 on custom object detection dataset

# @title Define train loop

def train_model(train_loader, val_loader, model, processor, epochs=10, lr=1e-6):
    optimizer = AdamW(model.parameters(), lr=lr)
    num_training_steps = epochs * len(train_loader)
    lr_scheduler = get_scheduler(
        name="linear",
        optimizer=optimizer,
        num_warmup_steps=0,
        num_training_steps=num_training_steps,
    )

    render_inference_results(model, val_loader.dataset, 6)

    for epoch in range(epochs):
        model.train()
        train_loss = 0
        for inputs, answers in tqdm(train_loader, desc=f"Training Epoch {epoch + 1}/{epochs}"):

            input_ids = inputs["input_ids"]
            pixel_values = inputs["pixel_values"]
            labels = processor.tokenizer(
                text=answers,
                return_tensors="pt",
                padding=True,
                return_token_type_ids=False
            ).input_ids.to(DEVICE)

            outputs = model(input_ids=input_ids, pixel_values=pixel_values, labels=labels)
            loss = outputs.loss

            loss.backward()
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()
            train_loss += loss.item()

        avg_train_loss = train_loss / len(train_loader)
        print(f"Average Training Loss: {avg_train_loss}")

        model.eval()
        val_loss = 0
        with torch.no_grad():
            for inputs, answers in tqdm(val_loader, desc=f"Validation Epoch {epoch + 1}/{epochs}"):

                input_ids = inputs["input_ids"]
                pixel_values = inputs["pixel_values"]
                labels = processor.tokenizer(
                    text=answers,
                    return_tensors="pt",
                    padding=True,
                    return_token_type_ids=False
                ).input_ids.to(DEVICE)

                outputs = model(input_ids=input_ids, pixel_values=pixel_values, labels=labels)
                loss = outputs.loss

                val_loss += loss.item()

            avg_val_loss = val_loss / len(val_loader)
            print(f"Average Validation Loss: {avg_val_loss}")

            render_inference_results(model, val_loader.dataset, 6)

        output_dir = f"./model_checkpoints/epoch_{epoch+1}"
        os.makedirs(output_dir, exist_ok=True)
        model.save_pretrained(output_dir)
        processor.save_pretrained(output_dir)
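
The "linear" schedule requested in train_model, with num_warmup_steps=0, decays the learning rate from its starting value to zero over num_training_steps. The sketch below re-implements that rule in plain Python for illustration; it is not the transformers code itself, and `linear_lr` is a hypothetical name. The progress bars in the training log report 136 batches per epoch, so 10 epochs give 1360 steps.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 10 * 136  # epochs * batches per epoch
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, base_lr=5e-6))
```

With no warmup, the very first update already uses the full learning rate, and the rate halves by the middle of training.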
%%time

EPOCHS = 10
LR = 5e-6

train_model(train_loader, val_loader, peft_model, processor, epochs=EPOCHS, lr=LR)
This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["bed"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["table"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["chair"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["furniture"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["bed"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["tablecloth"]}}

Training Epoch 1/10: 100%|██████████| 136/136 [01:40<00:00,  1.36it/s]
Average Training Loss: 5.220192882944556
Validation Epoch 1/10: 100%|██████████| 8/8 [00:03<00:00,  2.41it/s]
Average Validation Loss: 3.9150948226451874

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["bed"]}}

{"<OD>": {"bboxes": [[198.0800018310547, 175.0399932861328, 487.3599853515625, 496.3199768066406]], "labels": ["playing card"]}}

{"<OD>": {"bboxes": [[160.95999145507812, 210.87998962402344, 182.0800018310547, 248.0], [322.239990234375, 212.1599884033203, 344.0, 243.51998901367188]], "labels": ["human face", "human face"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["furniture"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["blanket"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 0.3199999928474426, 639.0399780273438, 639.0399780273438]], "labels": ["dining table"]}}

Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
Training Epoch 2/10: 100%|██████████| 136/136 [01:40<00:00,  1.35it/s]
Average Training Loss: 3.782239447621738
Validation Epoch 2/10: 100%|██████████| 8/8 [00:03<00:00,  2.42it/s]
Average Validation Loss: 3.1201466619968414

{"<OD>": {"bboxes": [[372.79998779296875, 112.95999908447266, 512.3200073242188, 357.44000244140625], [164.8000030517578, 330.55999755859375, 301.7599792480469, 585.2799682617188], [173.1199951171875, 14.399999618530273, 303.67999267578125, 253.1199951171875], [372.79998779296875, 112.95999908447266, 512.9599609375, 357.44000244140625], [309.44000244140625, 360.0, 447.03997802734375, 616.6400146484375], [52.79999923706055, 239.0399932861328, 166.0800018310547, 470.0799865722656]], "labels": ["queen of spades", "king of spade", "queen spades", "queen card", "queen's spades", "queen clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.8399963378906, 411.8399963378906], [198.72000122070312, 82.23999786376953, 381.1199951171875, 324.1600036621094], [333.1199951171875, 42.55999755859375, 516.7999877929688, 207.0399932861328]], "labels": ["6 of hearts", "7 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[368.3199768066406, 234.55999755859375, 518.0800170898438, 490.55999755859375]], "labels": ["queen of spades"]}}

{"<OD>": {"bboxes": [[56.0, 228.79998779296875, 331.1999816894531, 639.0399780273438], [436.79998779296875, 157.1199951171875, 557.1199951171875, 392.0], [297.91998291015625, 252.47999572753906, 459.1999816894531, 550.0800170898438], [330.55999755859375, 150.0800018310547, 479.67999267578125, 447.03997802734375]], "labels": ["8 of spades", "6 of spade", "7 of clubs", "6 of clubs"]}}

{"<OD>": {"bboxes": [[15.039999961853027, 254.39999389648438, 213.44000244140625, 464.9599914550781], [208.95999145507812, 285.7599792480469, 345.2799987792969, 461.7599792480469], [327.3599853515625, 191.67999267578125, 466.8799743652344, 397.7599792480469]], "labels": ["queen of spades", "7 of hearts", "6 of spade"]}}

{"<OD>": {"bboxes": [[294.0799865722656, 176.3199920654297, 624.3200073242188, 399.03997802734375], [11.199999809265137, 228.1599884033203, 274.8800048828125, 427.8399963378906], [96.95999908447266, 432.9599914550781, 314.55999755859375, 564.7999877929688], [309.44000244140625, 423.3599853515625, 548.1599731445312, 562.239990234375]], "labels": ["9 of clubs", "9 of hearts", "9 of diamonds", "9 of spades"]}}

Training Epoch 3/10: 100%|██████████| 136/136 [01:39<00:00,  1.36it/s]
Average Training Loss: 3.214196660939385
Validation Epoch 3/10: 100%|██████████| 8/8 [00:03<00:00,  2.45it/s]
Average Validation Loss: 2.605478435754776
BoundingBoxAnnotator is deprecated: `BoundingBoxAnnotator` is deprecated and has been renamed to `BoxAnnotator`. `BoundingBoxAnnotator` will be removed in supervision-0.26.0.

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 357.44000244140625], [164.16000366210938, 330.55999755859375, 301.1199951171875, 585.2799682617188], [53.439998626708984, 239.0399932861328, 166.0800018310547, 469.44000244140625]], "labels": ["queen of spades", "king of spade", "9 of hearts"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.8399963378906, 411.8399963378906], [200.63999938964844, 82.23999786376953, 381.1199951171875, 323.5199890136719], [330.55999755859375, 41.91999816894531, 517.4400024414062, 207.0399932861328]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs"]}}

{"<OD>": {"bboxes": [[368.3199768066406, 234.55999755859375, 518.0800170898438, 490.55999755859375], [186.55999755859375, 273.6000061035156, 396.47998046875, 511.67999267578125], [86.08000183105469, 163.51998901367188, 255.67999267578125, 402.8800048828125]], "labels": ["queen of spades", "queen spades", "queen card"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.75999450683594, 556.47998046875, 391.3599853515625], [298.55999755859375, 253.75999450683594, 457.91998291015625, 549.4400024414062], [333.1199951171875, 151.36000061035156, 479.03997802734375, 447.03997802734375]], "labels": ["6 of spades", "7 of spade", "5 of clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 466.8799743652344, 397.7599792480469], [208.95999145507812, 285.7599792480469, 345.2799987792969, 461.7599792480469], [463.67999267578125, 221.1199951171875, 635.8399658203125, 406.0799865722656]], "labels": ["6 of hearts", "7 of hearts", "9 of hearts"]}}

{"<OD>": {"bboxes": [[10.559999465942383, 227.51998901367188, 275.5199890136719, 428.47998046875], [98.23999786376953, 432.9599914550781, 314.55999755859375, 564.1599731445312], [310.7200012207031, 424.0, 548.1599731445312, 562.239990234375]], "labels": ["2 of spades", "5 of spade", "6 of spoons"]}}

Training Epoch 4/10: 100%|██████████| 136/136 [02:07<00:00,  1.07it/s]
Average Training Loss: 2.6999165170332966
Validation Epoch 4/10: 100%|██████████| 8/8 [00:03<00:00,  2.52it/s]
Average Validation Loss: 2.0440672636032104

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 357.44000244140625], [162.87998962402344, 330.55999755859375, 301.1199951171875, 585.2799682617188], [53.439998626708984, 239.0399932861328, 166.0800018310547, 469.44000244140625]], "labels": ["queen of spades", "king of spade", "9 of spoons"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 82.23999786376953, 381.1199951171875, 323.5199890136719], [330.55999755859375, 41.91999816894531, 516.7999877929688, 207.0399932861328]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs"]}}

{"<OD>": {"bboxes": [[368.3199768066406, 234.55999755859375, 518.0800170898438, 491.1999816894531], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 403.5199890136719]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of clubs"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.75999450683594, 556.47998046875, 391.3599853515625], [298.55999755859375, 254.39999389648438, 457.2799987792969, 550.0800170898438], [333.1199951171875, 152.0, 479.03997802734375, 447.03997802734375]], "labels": ["6 of spades", "7 of spade", "5 of spoons"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 466.8799743652344, 397.7599792480469], [208.95999145507812, 285.1199951171875, 345.2799987792969, 461.7599792480469], [463.67999267578125, 221.1199951171875, 636.47998046875, 406.0799865722656], [14.399999618530273, 254.39999389648438, 214.0800018310547, 465.5999755859375]], "labels": ["6 of hearts", "7 of hearts", "9 of hearts", "8 of hearts"]}}

{"<OD>": {"bboxes": [[10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875], [98.23999786376953, 433.5999755859375, 314.55999755859375, 564.1599731445312], [310.7200012207031, 424.0, 548.1599731445312, 562.239990234375]], "labels": ["2 of spades", "5 of spade", "9 of spoons"]}}

Training Epoch 5/10: 100%|██████████| 136/136 [01:38<00:00,  1.38it/s]
Average Training Loss: 2.1655465303098453
Validation Epoch 5/10: 100%|██████████| 8/8 [00:03<00:00,  2.47it/s]
Average Validation Loss: 1.7417135536670685

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 358.0799865722656], [162.87998962402344, 330.55999755859375, 301.1199951171875, 585.2799682617188], [53.439998626708984, 239.0399932861328, 167.36000061035156, 469.44000244140625]], "labels": ["queen of spades", "king of spade", "9 of spoons"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [330.55999755859375, 42.55999755859375, 516.7999877929688, 207.0399932861328]], "labels": ["9 of clubs", "7 of clubs", "5 of clubs"]}}

{"<OD>": {"bboxes": [[368.3199768066406, 234.55999755859375, 518.0800170898438, 491.8399963378906], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 403.5199890136719]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 391.3599853515625], [298.55999755859375, 254.39999389648438, 457.2799987792969, 550.0800170898438], [333.1199951171875, 151.36000061035156, 479.03997802734375, 447.03997802734375]], "labels": ["6 of spades", "7 of spade", "5 of spoons"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [463.67999267578125, 221.1199951171875, 636.47998046875, 406.0799865722656], [14.399999618530273, 254.39999389648438, 214.0800018310547, 466.239990234375]], "labels": ["6 of hearts", "7 of hearts", "5 of hearts", "8 of hearts"]}}

{"<OD>": {"bboxes": [[98.23999786376953, 433.5999755859375, 314.55999755859375, 563.5199584960938], [310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875]], "labels": ["5 of clubs", "9 of clubs", "7 of clubs"]}}

Training Epoch 6/10: 100%|██████████| 136/136 [01:39<00:00,  1.37it/s]
Average Training Loss: 1.8535377207924337
Validation Epoch 6/10: 100%|██████████| 8/8 [00:04<00:00,  1.90it/s]
Average Validation Loss: 1.6721670180559158

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 358.0799865722656], [162.87998962402344, 330.55999755859375, 301.1199951171875, 585.2799682617188], [53.439998626708984, 239.0399932861328, 167.36000061035156, 469.44000244140625], [310.0799865722656, 360.0, 446.3999938964844, 616.6400146484375]], "labels": ["queen of spades", "king of spade", "9 of sp hearts", "10 of sp clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [330.55999755859375, 42.55999755859375, 516.7999877929688, 207.0399932861328], [197.44000244140625, 173.1199951171875, 488.6399841308594, 498.239990234375]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs", "8 of clubs"]}}

{"<OD>": {"bboxes": [[369.6000061035156, 234.55999755859375, 518.0800170898438, 491.1999816894531], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 402.8800048828125], [258.239990234375, 168.63999938964844, 388.79998779296875, 317.1199951171875]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons", "jack of spands"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 391.3599853515625], [333.1199951171875, 151.36000061035156, 479.03997802734375, 447.03997802734375], [298.55999755859375, 254.39999389648438, 457.2799987792969, 550.0800170898438]], "labels": ["6 of spades", "5 of spade", "7 of sp clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [14.399999618530273, 254.39999389648438, 214.0800018310547, 466.239990234375], [462.3999938964844, 221.1199951171875, 636.47998046875, 407.3599853515625]], "labels": ["6 of hearts", "7 of hearts", "8 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[97.5999984741211, 433.5999755859375, 314.55999755859375, 564.1599731445312], [310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875]], "labels": ["5 of clubs", "9 of clubs", "7 of clubs"]}}

Training Epoch 7/10: 100%|██████████| 136/136 [02:39<00:00,  1.17s/it]
Average Training Loss: 1.741150463328642
Validation Epoch 7/10: 100%|██████████| 8/8 [00:07<00:00,  1.09it/s]
Average Validation Loss: 1.6433503776788712

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 358.0799865722656], [162.87998962402344, 330.55999755859375, 301.1199951171875, 585.2799682617188], [53.439998626708984, 239.0399932861328, 167.36000061035156, 469.44000244140625], [310.0799865722656, 360.0, 446.3999938964844, 616.6400146484375]], "labels": ["queen of spades", "king of spade", "9 of sp hearts", "10 of sp clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [333.1199951171875, 43.20000076293945, 516.7999877929688, 207.0399932861328], [198.72000122070312, 175.0399932861328, 488.6399841308594, 497.5999755859375]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs", "8 of clubs"]}}

{"<OD>": {"bboxes": [[369.6000061035156, 234.55999755859375, 518.0800170898438, 491.8399963378906], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 402.8800048828125], [257.6000061035156, 168.63999938964844, 388.79998779296875, 317.1199951171875]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons", "jack of spands"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 391.3599853515625], [333.1199951171875, 151.36000061035156, 479.03997802734375, 447.03997802734375], [298.55999755859375, 254.39999389648438, 457.2799987792969, 550.0800170898438]], "labels": ["6 of spades", "5 of spade", "7 of sp clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [15.039999961853027, 254.39999389648438, 214.0800018310547, 465.5999755859375], [463.67999267578125, 221.1199951171875, 636.47998046875, 406.0799865722656]], "labels": ["6 of hearts", "7 of hearts", "8 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [98.23999786376953, 433.5999755859375, 314.55999755859375, 564.1599731445312], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875], [291.5199890136719, 175.0399932861328, 625.5999755859375, 401.6000061035156]], "labels": ["9 of clubs", "5 of clubs", "7 of clubs", "8 of clubs"]}}

Training Epoch 8/10: 100%|██████████| 136/136 [02:18<00:00,  1.02s/it]
Average Training Loss: 1.707651296959204
Validation Epoch 8/10: 100%|██████████| 8/8 [00:03<00:00,  2.47it/s]
Average Validation Loss: 1.6269832402467728

{"<OD>": {"bboxes": [[372.79998779296875, 113.5999984741211, 512.3200073242188, 358.0799865722656], [161.59999084472656, 330.55999755859375, 301.1199951171875, 585.2799682617188], [52.79999923706055, 239.0399932861328, 167.36000061035156, 470.0799865722656], [310.0799865722656, 360.0, 446.3999938964844, 616.6400146484375]], "labels": ["queen of spades", "king of spade", "9 of sp hearts", "10 of sp clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [333.1199951171875, 42.55999755859375, 516.7999877929688, 207.0399932861328], [198.0800018310547, 173.1199951171875, 488.6399841308594, 497.5999755859375]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs", "8 of clubs"]}}

{"<OD>": {"bboxes": [[369.6000061035156, 234.55999755859375, 518.0800170898438, 491.8399963378906], [18.8799991607666, 289.6000061035156, 224.95999145507812, 582.0800170898438], [186.55999755859375, 273.6000061035156, 397.1199951171875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 403.5199890136719], [257.6000061035156, 168.63999938964844, 388.79998779296875, 317.1199951171875]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons", "jack of sp hearts"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 390.0799865722656], [333.1199951171875, 150.72000122070312, 479.03997802734375, 447.03997802734375], [298.55999755859375, 254.39999389648438, 457.2799987792969, 549.4400024414062]], "labels": ["6 of spades", "5 of spade", "7 of sp clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [14.399999618530273, 254.39999389648438, 214.0800018310547, 466.239990234375], [462.3999938964844, 221.1199951171875, 636.47998046875, 407.3599853515625]], "labels": ["6 of hearts", "7 of hearts", "8 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [98.23999786376953, 433.5999755859375, 314.55999755859375, 564.1599731445312], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875], [291.5199890136719, 175.0399932861328, 625.5999755859375, 401.6000061035156]], "labels": ["9 of clubs", "5 of clubs", "7 of clubs", "8 of clubs"]}}

Training Epoch 9/10: 100%|██████████| 136/136 [01:38<00:00,  1.38it/s]
Average Training Loss: 1.6793172578601276
Validation Epoch 9/10: 100%|██████████| 8/8 [00:03<00:00,  2.42it/s]
Average Validation Loss: 1.618808850646019

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 358.0799865722656], [161.59999084472656, 330.55999755859375, 301.1199951171875, 585.2799682617188], [52.79999923706055, 239.0399932861328, 167.36000061035156, 470.0799865722656], [310.0799865722656, 360.0, 446.3999938964844, 616.6400146484375]], "labels": ["queen of spades", "king of spade", "9 of sp hearts", "10 of sp clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [333.1199951171875, 42.55999755859375, 516.7999877929688, 207.0399932861328], [198.0800018310547, 173.1199951171875, 488.6399841308594, 497.5999755859375]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs", "8 of clubs"]}}

{"<OD>": {"bboxes": [[369.6000061035156, 234.55999755859375, 518.0800170898438, 491.8399963378906], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 403.5199890136719], [257.6000061035156, 168.63999938964844, 388.79998779296875, 317.1199951171875]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons", "jack of sp hearts"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 390.0799865722656], [333.1199951171875, 150.72000122070312, 479.03997802734375, 447.03997802734375], [298.55999755859375, 254.39999389648438, 457.2799987792969, 549.4400024414062]], "labels": ["6 of spades", "5 of spade", "7 of sp clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [15.039999961853027, 254.39999389648438, 214.0800018310547, 465.5999755859375], [463.67999267578125, 221.1199951171875, 636.47998046875, 406.0799865722656]], "labels": ["6 of hearts", "7 of hearts", "8 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [98.23999786376953, 433.5999755859375, 314.55999755859375, 563.5199584960938], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875], [291.5199890136719, 175.0399932861328, 625.5999755859375, 401.6000061035156]], "labels": ["9 of clubs", "5 of clubs", "7 of clubs", "8 of clubs"]}}

Training Epoch 10/10: 100%|██████████| 136/136 [01:40<00:00,  1.35it/s]
Average Training Loss: 1.6717844158411026
Validation Epoch 10/10: 100%|██████████| 8/8 [00:03<00:00,  2.47it/s]
Average Validation Loss: 1.615568920969963

{"<OD>": {"bboxes": [[372.79998779296875, 112.31999969482422, 512.3200073242188, 358.0799865722656], [161.59999084472656, 330.55999755859375, 301.1199951171875, 585.2799682617188], [52.79999923706055, 239.0399932861328, 167.36000061035156, 470.0799865722656], [310.0799865722656, 360.0, 446.3999938964844, 616.6400146484375]], "labels": ["queen of spades", "king of spade", "9 of sp hearts", "10 of sp clubs"]}}

{"<OD>": {"bboxes": [[0.3199999928474426, 128.3199920654297, 267.1999816894531, 411.8399963378906], [200.63999938964844, 83.5199966430664, 381.1199951171875, 323.5199890136719], [333.1199951171875, 42.55999755859375, 516.7999877929688, 207.0399932861328], [198.0800018310547, 173.1199951171875, 488.6399841308594, 497.5999755859375]], "labels": ["6 of clubs", "7 of clubs", "5 of clubs", "8 of clubs"]}}

{"<OD>": {"bboxes": [[369.6000061035156, 234.55999755859375, 518.0800170898438, 491.8399963378906], [18.8799991607666, 289.6000061035156, 223.67999267578125, 582.0800170898438], [186.55999755859375, 273.6000061035156, 396.47998046875, 512.3200073242188], [87.36000061035156, 164.16000366210938, 255.0399932861328, 403.5199890136719], [257.6000061035156, 168.63999938964844, 388.79998779296875, 317.1199951171875]], "labels": ["9 of spades", "10 of spade", "king of sp clubs", "queen of spoons", "jack of sp hearts"]}}

{"<OD>": {"bboxes": [[437.44000244140625, 157.1199951171875, 556.47998046875, 390.0799865722656], [333.1199951171875, 150.72000122070312, 479.03997802734375, 447.03997802734375], [298.55999755859375, 254.39999389648438, 457.2799987792969, 549.4400024414062]], "labels": ["6 of spades", "5 of spade", "7 of sp clubs"]}}

{"<OD>": {"bboxes": [[328.6399841308594, 192.3199920654297, 467.5199890136719, 399.03997802734375], [208.95999145507812, 285.1199951171875, 346.55999755859375, 463.67999267578125], [15.039999961853027, 254.39999389648438, 214.0800018310547, 465.5999755859375], [463.67999267578125, 221.1199951171875, 636.47998046875, 406.0799865722656]], "labels": ["6 of hearts", "7 of hearts", "8 of hearts", "5 of hearts"]}}

{"<OD>": {"bboxes": [[310.0799865722656, 424.0, 548.7999877929688, 562.239990234375], [98.23999786376953, 433.5999755859375, 314.55999755859375, 563.5199584960938], [10.559999465942383, 227.51998901367188, 275.5199890136719, 429.1199951171875], [291.5199890136719, 175.0399932861328, 625.5999755859375, 401.6000061035156]], "labels": ["9 of clubs", "5 of clubs", "7 of clubs", "8 of clubs"]}}

CPU times: user 14min 58s, sys: 5min 54s, total: 20min 53s
Wall time: 19min 49s
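
The per-epoch losses printed above are easier to compare once they are pulled out of the log. The following is a minimal sketch, assuming the cell output has been captured as a string. The `LOG` excerpt copies the loss lines for epochs 3, 6, and 10 from the output above; `parse_losses` and `LOSS_LINE` are hypothetical names, not part of the notebook.

```python
import re

# Excerpt copied from the training output above (epochs 3, 6, and 10).
LOG = """\
Average Training Loss: 3.214196660939385
Average Validation Loss: 2.605478435754776
Average Training Loss: 1.8535377207924337
Average Validation Loss: 1.6721670180559158
Average Training Loss: 1.6717844158411026
Average Validation Loss: 1.615568920969963
"""

LOSS_LINE = re.compile(r"Average (Training|Validation) Loss: ([0-9.]+)")

def parse_losses(log_text: str) -> dict[str, list[float]]:
    """Collect training and validation losses in the order they were logged."""
    losses = {"Training": [], "Validation": []}
    for kind, value in LOSS_LINE.findall(log_text):
        losses[kind].append(float(value))
    return losses

losses = parse_losses(LOG)
for train, val in zip(losses["Training"], losses["Validation"]):
    print(f"train={train:.3f}  val={val:.3f}  gap={train - val:+.3f}")
```

In this run the validation loss falls from about 2.61 after epoch 3 to about 1.62 after epoch 10, with most of the improvement happening before epoch 6.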

Fine-tuned model evaluation

# @title Check if the model can still detect objects outside of the custom dataset

image = Image.open(EXAMPLE_IMAGE_PATH)
task = "<OD>"
text = "<OD>"

inputs = processor(text=text, images=image, return_tensors="pt").to(DEVICE)
generated_ids = peft_model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
response = processor.post_process_generation(generated_text, task=task, image_size=(image.width, image.height))
detections = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, response, resolution_wh=image.size)

box_annotator = sv.BoxAnnotator(color_lookup=sv.ColorLookup.INDEX)
label_annotator = sv.LabelAnnotator(color_lookup=sv.ColorLookup.INDEX)

image = box_annotator.annotate(image, detections)
image = label_annotator.annotate(image, detections)
image.thumbnail((600, 600))
image

NOTE: It seems that the model can still detect classes that don’t belong to our custom dataset.

# @title Collect predictions

PATTERN = r'([a-zA-Z0-9 ]+ of [a-zA-Z0-9 ]+)<loc_\d+>'

def extract_classes(dataset: DetectionDataset):
    class_set = set()
    for i in range(len(dataset.dataset)):
        image, data = dataset.dataset[i]
        suffix = data["suffix"]
        classes = re.findall(PATTERN, suffix)
        class_set.update(classes)
    return sorted(class_set)

CLASSES = extract_classes(train_dataset)

targets = []
predictions = []

for i in range(len(val_dataset.dataset)):
    image, data = val_dataset.dataset[i]
    prefix = data['prefix']
    suffix = data['suffix']

    inputs = processor(text=prefix, images=image, return_tensors="pt").to(DEVICE)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

    prediction = processor.post_process_generation(generated_text, task='<OD>', image_size=image.size)
    prediction = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, prediction, resolution_wh=image.size)
    prediction = prediction[np.isin(prediction['class_name'], CLASSES)]
    prediction.class_id = np.array([CLASSES.index(class_name) for class_name in prediction['class_name']])
    prediction.confidence = np.ones(len(prediction))

    target = processor.post_process_generation(suffix, task='<OD>', image_size=image.size)
    target = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, target, resolution_wh=image.size)
    target.class_id = np.array([CLASSES.index(class_name) for class_name in target['class_name']])

    targets.append(target)
    predictions.append(prediction)
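
The behavior of `PATTERN` and of the `np.isin` filter in the cell above can be seen on a small example. This is a sketch under stated assumptions: the `suffix` string is made up for illustration (the location token values are arbitrary), and `classes` holds only the two labels found in it. The predicted labels are copied from the training output earlier in this notebook.

```python
import re
import numpy as np

PATTERN = r'([a-zA-Z0-9 ]+ of [a-zA-Z0-9 ]+)<loc_\d+>'

# Hypothetical suffix: each label is followed by four location tokens.
suffix = (
    "queen of spades<loc_582><loc_176><loc_800><loc_558>"
    "king of spades<loc_254><loc_516><loc_470><loc_914>"
)

# The capture group keeps the label and drops the first <loc_...> token.
classes = sorted(set(re.findall(PATTERN, suffix)))
print(classes)

# Labels generated by the model (copied from the training output above).
predicted = np.array(["queen of spades", "king of spade", "9 of spoons"])

# Keep only predictions whose label exactly matches a known class.
keep = np.isin(predicted, classes)
print(predicted[keep])
```

Near-miss labels such as `king of spade` are dropped rather than corrected, so those boxes count as missed detections in the metrics that follow.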
# @title Calculate mAP
# mean_average_precision = sv.MeanAveragePrecision.from_detections(
#     predictions=predictions,
#     targets=targets,
# )
mean_average_precision = sv.metrics.MeanAveragePrecision().update(predictions, targets).compute()

print(f"map50_95: {mean_average_precision.map50_95:.2f}")
print(f"map50: {mean_average_precision.map50:.2f}")
print(f"map75: {mean_average_precision.map75:.2f}")
map50_95: 0.30
map50: 0.32
map75: 0.32
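
The suffixes in `map50` and `map75` refer to the IoU (intersection over union) threshold, 0.50 or 0.75, that a predicted box must reach against a ground-truth box to count as a match; `map50_95` averages over thresholds from 0.50 to 0.95. Below is a minimal sketch of the IoU computation, assuming boxes in `(x_min, y_min, x_max, y_max)` pixel format. `box_iou` is a hypothetical helper, not the `supervision` implementation, and the two boxes are the first predicted box of the first validation image as logged after epochs 3 and 10 (rounded to two decimals).

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle; width and height clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

box_epoch_3 = (372.80, 112.32, 512.32, 357.44)
box_epoch_10 = (372.80, 112.32, 512.32, 358.08)

iou = box_iou(box_epoch_3, box_epoch_10)
print(f"IoU = {iou:.4f}, matches at 0.50: {iou >= 0.50}, at 0.75: {iou >= 0.75}")
```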
p = sv.metrics.Precision()
p = p.update(predictions, targets).compute()
print(p.precision_at_50)

r = sv.metrics.Recall()
r = r.update(predictions, targets).compute()
print(r.recall_at_50)
0.5440355329949238
0.3553299492385787
invalid value encountered in divide
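
Precision and recall at IoU 0.50 can be combined into a single F1 score. The sketch below uses the two values printed above; `f1_score` is a hypothetical helper that returns 0.0 when both inputs are zero. The `invalid value encountered in divide` message is NumPy's warning for a 0/0 division; a likely cause here, though the output does not confirm it, is a class with no predictions or no targets.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    if total == 0:
        return 0.0  # avoid the 0/0 division
    return 2 * precision * recall / total

precision_at_50 = 0.5440355329949238  # printed above
recall_at_50 = 0.3553299492385787     # printed above

print(f"F1@50: {f1_score(precision_at_50, recall_at_50):.2f}")
```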
# @title Calculate Confusion Matrix
confusion_matrix = sv.ConfusionMatrix.from_detections(
    predictions=predictions,
    targets=targets,
    classes=CLASSES
)

_ = confusion_matrix.plot()

Save fine-tuned model on hard drive

peft_model.save_pretrained("/content/florence2-lora")
processor.save_pretrained("/content/florence2-lora/")
!ls -la /content/florence2-lora/
---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
Cell In[25], line 1
----> 1 peft_model.save_pretrained("/content/florence2-lora")
      2 processor.save_pretrained("/content/florence2-lora/")
      3 get_ipython().system('ls -la /content/florence2-lora/')

File /opt/anaconda3/envs/zeel_py310/lib/python3.10/site-packages/peft/peft_model.py:320, in PeftModel.save_pretrained(self, save_directory, safe_serialization, selected_adapters, save_embedding_layers, is_main_process, path_initial_model_for_weight_conversion, **kwargs)
    317     return output_state_dict
    319 if is_main_process:
--> 320     os.makedirs(save_directory, exist_ok=True)
    321     self.create_or_update_model_card(save_directory)
    323 for adapter_name in selected_adapters:

File /opt/anaconda3/envs/zeel_py310/lib/python3.10/os.py:215, in makedirs(name, mode, exist_ok)
    213 if head and tail and not path.exists(head):
    214     try:
--> 215         makedirs(head, exist_ok=exist_ok)
    216     except FileExistsError:
    217         # Defeats race condition when another thread created the path
    218         pass

File /opt/anaconda3/envs/zeel_py310/lib/python3.10/os.py:225, in makedirs(name, mode, exist_ok)
    223         return
    224 try:
--> 225     mkdir(name, mode)
    226 except OSError:
    227     # Cannot rely on checking for EEXIST, since the operating system
    228     # could give priority to other errors like EACCES or EROFS
    229     if not exist_ok or not path.isdir(name):

PermissionError: [Errno 13] Permission denied: '/content'
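
The traceback above shows that `/content` could not be created on this machine. That directory exists by default on Colab, but not on a typical local setup. A minimal sketch for choosing a writable location, assuming you are free to change the save path (the helper name `resolve_save_dir` and the fallback location are illustrative, not part of the original notebook):

```python
import os
from pathlib import Path

def resolve_save_dir(preferred, fallback):
    """Return `preferred` if it can be created, otherwise `fallback`."""
    preferred_path = Path(preferred)
    try:
        preferred_path.mkdir(parents=True, exist_ok=True)
        return preferred_path
    except OSError:
        # PermissionError and similar failures are subclasses of OSError.
        fallback_path = Path(fallback)
        fallback_path.mkdir(parents=True, exist_ok=True)
        return fallback_path

# Usage (commented out because it needs the trained model and processor):
# save_dir = resolve_save_dir("/content/florence2-lora",
#                             os.path.join(os.getcwd(), "florence2-lora"))
# peft_model.save_pretrained(str(save_dir))
# processor.save_pretrained(str(save_dir))
```

If the fallback is used, pass the same directory as `model_path` in any later step that reads the saved weights.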

Upload model to Roboflow (optional)

You can deploy your Florence-2 object detection model on your own hardware (e.g. a cloud GPU server or an NVIDIA Jetson) with Roboflow Inference, an open source computer vision inference server.

To deploy your model, you will need a free Roboflow account.

To get started, create a new Project in Roboflow if you don’t already have one, and upload the dataset you used to train your model. Then create a dataset Version, a snapshot of your dataset with which your model will be associated in Roboflow.

You can read our full Deploy Florence-2 with Roboflow guide for step-by-step instructions.

Once you have trained your model, you can upload it to Roboflow using the following code:

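from roboflow import Roboflow

# Replace the placeholders below with your own values.
rf = Roboflow(api_key="API_KEY")
project = rf.workspace("workspace-id").project("project-id")
version = project.version(VERSION)

# Upload the saved weights directory to the selected dataset version.
version.deploy(model_type="florence-2", model_path="/content/florence2-lora")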

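Above, replace API_KEY with your Roboflow API key, workspace-id and project-id with your workspace and project IDs, and VERSION with your dataset version number.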

If you are not using our notebook, replace /content/florence2-lora with the directory where you saved your model weights.

When you run the code above, the model will be uploaded to Roboflow. It will take a few minutes for the model to be processed before it is ready for use.

Deploy to your hardware

Once your model has been processed, you can download it to any device on which you want to deploy your model. Deployment is supported through Roboflow Inference, our open source computer vision inference server.

Inference can be run as a microservice with Docker, ideal for large deployments where you may need a centralized server on which to run inference, or when you want to run Inference in an isolated container. You can also directly integrate Inference into your project through the Inference Python SDK.

For this guide, we will show how to deploy the model with the Python SDK.

First, install inference:

!pip install inference

Then, create a new Python file and add the following code:

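from inference import get_model
from PIL import Image

# Load the fine-tuned model; replace the model ID and API key with your own.
lora_model = get_model("model-id/version-id", api_key="KEY")

# Run inference on a local image and print the raw response.
image = Image.open("containers.png")
response = lora_model.infer(image)
print(response)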

In the code above, we load our model, run it on an image, and print the response.

When you first run the code, your model weights will be downloaded and cached on your device for subsequent runs. This may take a few minutes depending on the speed of your internet connection.
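
Rather than hard-coding the key in the script, one option is to read it from an environment variable. The sketch below assumes the key is exported as `ROBOFLOW_API_KEY` (the name used for the Colab secret earlier in this notebook); the helper name `load_api_key` is illustrative.

```python
import os

def load_api_key(env_var="ROBOFLOW_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before loading the model."
        )
    return api_key

# Usage (commented out because it needs a real key and model ID):
# lora_model = get_model("model-id/version-id", api_key=load_api_key())
```

Failing early with a clear message avoids a less obvious authentication error later, when the weights are first downloaded.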

Congratulations

⭐️ If you enjoyed this notebook, star the Roboflow Notebooks repo (and supervision while you’re at it) and let us know what tutorials you’d like to see us do next. ⭐️